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1. Introduction and Motivation 

We start with an observable CI measured in the vacuum, (D.) = J 3tUe~ s W[det^(U)] a Q.(U)/Z, 
where the parameter a defines the number of fermion flavours the theory represents (a = N//4 for 
staggered and Nf/2 for Wilson). We choose to include the operator = M^M as opposed to just 
M (the Dirac operator) to allow us to update the pseudofermion fields using a heatbath. When a is 
non-integer, the conventional HMC algorithm fails, and this necessitates another algorithm choice 
if we are to include the effects due to the strange quark (or even if we want to simulate a two flavour 
staggered fermion theory). 

The most popular algorithm to date for performing such calculations has been the R algorithm 
(R) [jl|]. Here the fermionic determinant is rewritten in exponential-trace-logarithm form det^" = 
exp (atrln„#). When the equations of motion for this effective action are derived, it is found that 
the explicit matrix inverse is required. To circumvent this problem the trace is replaced by a noisy 
estimator, this results in a fermionic force term almost identical to that obtained from HMC where 
a pseudofermion formulation is used. The key difference is that in the former the force is merely 
a noisy estimator of the true fermion force, whereas in the latter the derived force is exactly the 
pseudofermion force. This means that the 0{8x) error term in the conserved Hamiltonian no 
longer cancels automatically, and to force this cancellation an irreversible and non-area preserving 
integrator must be used. Thus the algorithm cannot be made exact through the inclusion of a Monte 
Carlo acceptance test and any results generated using R will have finite 0{8x 2 ) errors. 

2. Rational Hybrid Monte Carlo 

The Rational Hybrid Monte Carlo Algorithm (RHMC) was designed to overcome the shortcom- 
ings of R, namely the inexact nature of the calculations (for a proper description of the algo- 
rithm see [Q], the description given here is sparse on details). The fermion determinant is rewrit- 
ten in pseudofermion formulation, and a rational approximation is used to represent the effec- 
tive matrix appearing in the bilinear term, det^# a = / tyfyQifye^-^ °^ / &^&<^e^^ r (-^00, 
with r{x) wx-"/ 2 l . If the error A on the rational approximation is made arbitrarily small, e.g., 
A(x) = |l -x a l 2 r(x) \ < 1(T 14 , then there can be no systematic bias induced from using a rational 
approximation and the conventional HMC algorithm can be used, but with the rational function 
used as the kernel in the bilinear. 

Any rational function can be written as a sum of partial fractions, r(x) = Ok/(* + ft)> 
in this form a multi-shift solver can be used to evaluate all shifts for approximately the same cost 
as the smallest shift (which is near zero). A lower degree approximation can be used for the MD 
evolution since any errors are corrected for stochastically when the accept/reject test is performed. 
This approximation is generally set to f « ^#~ a « r 2 to avoid the double inversion which would 
have arisen from using r 2 . The resulting pseudofermion force is written 

h 

Sf = - £ a,0 f {JC + ft) ~ 1 Jt' {Jt + ft) ~ V . (2.1) 

'This seemingly perverse choice of including r 2 in the action is to allow for heatbath evaluation of the pseudo- 
fermion fields. 
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Hence the cost of the algorithm is similar to HMC in that it only requires one conjugate gradient 
inversion per MD step. 



3. Finite temperature QCD 

The comparison of the algorithms' performance near the chiral transition point is of particular 
relevance. In this regime it has been shown that the exact location of the transition point j8 c can 
be strongly affected by finite step-size errors |3[]. The aim of our study was to see how RHMC 
behaves in this regime, and to compare the cost against using R. The parameters we used are given 
in Table [l[ In Figures [I] and ^ plots of the variation of the plaquette and of yy are given: only for 
the smaller step-size is R consistent with RHMC. 
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Figure 1: The plaquette behaviour near the chiral Figure 2: The yy behaviour near the chiral transi- 
transition point (parameters given in Table |l]). tion point (parameters given in Table |l]). 
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Table 1: Binder cumulant of (iff iff), B4, and 
RHMC acceptance rate A from the finite 
temperature study (2+1 flavour naive stag- 
gered fermions, Wilson gauge action, V = 
8 3 x 4, m ui = 0.0076, m s = 0.25, T = 1.0) 
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Table 2: The different masses at which the domain wall re- 
sults were gathered, together with the step-sizes 5t, acceptance 
rates A and plaquettes P (V — 16 3 x 32 x 8, DBW2 gauge ac- 
tion, J3 = 0.72). 



On each configuration the Binder cumulant was measured. The cumulant is important because 
it probes the strength of the phase transition. Only when R is using an integration step-size of 
8z < m ud /4 is the step-size sufficiently small that the finite step-size errors are negligible (see 
Table [j]). The correct value is obtained with RHMC using a step-size as 29 times larger: this 
represents a considerable improvement in efficiency. 
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4. 2+1 Domain Wall Fermions 



Again a comparison of R and RHMC was performed, but now using domain wall fermions with a 
more realistic volume (the parameters from this study are presented in Table Applying RHMC 
to the case of domain wall fermions is not as trivial as for staggered fermions. The one flavour 

7am m T he 



domain wall fermion determinant is given by det y m\ v M vv j det y M PW M PW = det 

additional Pauli-Villars bosonic determinant is required to cancel the heavy modes appearing in the 

bulk of the fifth dimension. Unfortunately, we cannot use r(^# DWF ) as this would lead to a nested 

1/2 / * \ -'A 



inversion in the solver. Therefore, the action is written S F = <p [M P \M PW j <p +% ^M PF M PF j %, 
and now each matrix kernel can be written as a rational approximation. Thus we require 2 fermion 
fields to simulate a single flavour contribution. The naive additional cost of this formulation is 
small (nipv 3> m nA ), but is inherently more noisy because the heavy mode cancellation is only done 
stochastically. This results in larger forces, and a smaller step-size will be required than if the 
cancellation was exact [Q] (the resolution to this problem shall be presented in § |5.2[ ). The R algo- 
rithm also uses stochastic cancellation, the bosonic Pauli-Villars field is included through the use 
of negative flavour number. 

The step-size dependence of the plaquette is shown in Figure ||, from this extrapolation it is 
clear that to obtain a consistent result between the algorithms requires that R use an integration 
step-size at least 10 times smaller than RHMC. 
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Figure 3: The step-size variation of the plaquette 
(m ud = 0.02, additional parameters given in TableH). 




Figure 4: The integrated autocorrelation time of the 
13* time-slice of the pion propagator (m ai — 0.04, 



parameters given in Table [ 



No algorithm comparison would be complete without a comparison of autocorrelations. In 
this study, both the plaquette and pion integrated autocorrelation times were measured, and it was 
found that there was very little to distinguish the two algorithms (see, e.g., Figure^). 



5. Improving RHMC 
5.1 Integration Scheme 

It has recently been shown [gj], that the optimal second order symmetric symplectic integrator is 
not the leapfrog integrator, rather it is that given by Omelyan et al [Q], which is given by 

U qpqpq (8t) = e lSrQ e SxP ' 2 e^ 2X ^ Q e STP ' 2 e lS * Q , 
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where Q and P represent the coordinate and momenta update operators respectively and A is a 
tuneable parameter whose optimal value is found to be A as 0.1931. The Omelyan integrator is 
approximately double the cost of leapfrog, but theoretically should lead to a 3 fold improvement in 
conservation of the Hamiltonian: a net 50% gain. This integration scheme can of course be used 
with RHMC, and leads to a near 40% improvement in acceptance rate (see Table ^]). 
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Table 3: A comparison of the leapfrog and 
Omelyan integrators, using domain wall fermions 
(V = 16 3 x 32 x 8, Iwasaki gauge action, /3 = 2.13, 
m ud = m s /2). 
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Figure 5: DWF Force magnitude (j3 = 2.13, 
m ud /m s = 0.25, V = 24 3 x 64 x 16) 



5.2 Exact Heavy Mode Cancellation 

In the results presented in §|], the RHMC domain wall calculations were performed using only a 
stochastic cancellation of the heavy modes, and it was noted that this leads to larger fermion forces, 
hence a more costly algorithm. Since that study was conducted, a resolution to this problem has 
been found. The one flavour DWF determinant can be rewritten as 



/ detMiMpp = ^ f vMpv) - l/8(M 1 pMpF) V4 (M f vMpy )-vf , 
V detMp T v Mpv 

and the resulting pseudofermion action is given by 

S F = [(Mp t v Mp V ) 1/4 (Mp t F Mp F )- 1/2 (M p t v M PV ) 1/4 ] 2 = [r 1 (Mp t v M PV )r 2 (Mp t F Mp F )n(Mp t v M PV )] 2 0; 



where r\ (x) m x [/4 and r%(x) x~ 1/2 . The kernel that appears in the bilinear term is now in a form 
that allows heatbath evaluation (it is the square of a real positive operator) and a multi-shift solver 
can be used to evaluate each rational function that appears in the action. At each step of the MD 
trajectory three inversions are required compared to two for the stochastic formulation presented in 
§|[ Since two of these inversions are using the Pauli-Villars matrix the cost of the extra inversion 
is negligible. This formulation should allow for large increases in the integration step-size. This 
shall be the subject of further study. 

5.3 Fermion force tuning 



The RHMC fermionic force given in equation (2.1) is just a sum of HMC-like force terms, each 



with different magnitude. Figure || is a plot showing the magnitude of the force associated with 
each shift. Also included is the number of conjugate gradient iterations required by the solver for 
each shift. The key point is that the most expensive shifts are also those which contribute least to 
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the total fermion force. Since the acceptance rate is determined by the quantity F8t, this suggests 
that the small shifts can use a larger integration step-size than the large shifts. Hence, we could 
split the partial fractions into multiple timescales and use a multi-timescale integration scheme. An 
alternative strategy consists of relaxing the convergence of the multi-shift solver for the smaller 
shifts, but maintaining tight convergence for the larger shifts. An initial study into how much 
improvement can be gained from these strategies is currently being undertaken, it would appear 
that at least a factor of two can be gained, subject to further tuning. 

6. Conclusions and outlook 

In this work we have presented the exact RHMC algorithm and compared it to the inexact R algo- 
rithm in two regimes, namely near the chiral transition point of QCD using small volumes, and at 
low temperature using more realistic volumes. In both regimes, it was found that R must be run 
using a much smaller step-size than RHMC to achieve consistency between the two algorithms. 
Extrapolation of the R results to zero step-size is also significantly more expensive than just using 
RHMC. The conclusion therefore, is that there is no need for further use of the R algorithm. 

Various improvements to the standard RHMC formulation were presented which lead to fur- 
ther performance improvements. It would be interesting to compare this improved RHMC to other 
exact algorithms, indeed this shall be the subject of future study. 
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